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Abstract 

One of the important topics in oceanography is the prediction of 
ocean circulation. The goal of data assimilation is to combine the 
mathematical information provided by the modeling of ocean dynam- 
ics with observations of the ocean circulation, e.g. measurements of 
the sea surface height (SSH). In this paper, we focus on a particular 
class of extended Kalman filters as a data assimilation method: nudg- 
ing techniques, in which a corrective feedback term is added to the 
model equations. We consider here a standard shallow water model, 
and we define an innovation term that takes into account the measure- 
ments and respects the symmetries of the physical model. We prove 
the convergence of the estimation error to zero on a linear approxi- 
mation of the system. It boils down to estimating the fluid velocity 
in a water-tank system using only SSH measurements. The observer 
is very robust to noise and easy to tune. The general nonlinear case 
is illustrated by numerical experiments, and the results are compared 
with the standard nudging techniques. 

Keywords: Nudging, observer, symmetries, shallow water model, partial 
differential equation, estimation, data assimilation. 

1 Introduction 

Data assimilation consists in estimating the state of a system combining two 
different sources of information via numerical methods: models, and observa- 
tions. There is an increasing need for such methods in physical oceanography. 
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as the monitoring of the ocean provides crucial information about chmate 
changes [21], and the models, although very complex, need to be combined 
with the observations of ever increasing quality. Notably, the amount of data 
available in oceanography has drastically increased in the last years with the 
use of satellites. 

This paper deals with data assimilation, focusing on a particular tech- 
nique: nudging. The standard nudging algorithm consists in applying a 
Newtonian recall of the state value towards its direct observation [10] . From 
another point of view, the principle of nudging is to use observers of the Lu- 
enberger, or extended Kalman filter type for data assimilation [161 [12] • The 
model appears then as a weak constraint, and the nudging term forces the 
state variables to fit as well as possible the observations. This forcing term in 
the model dynamics has a tunable coefficient that represents the relaxation 
time scale. This coefficient is usually chosen by numerical experimentation so 
as to keep the nudging terms small in comparison with the state equations, 
and large enough to force the model to fit the observations. The nudging 
term can also be seen as a penalty term, which penalizes the system if the 
model is too far from the observations. The nudging method, or more gen- 
erally observers, is a flexible assimilation technique, computationally much 
more economical than variational data assimilation methods [2^ 115]. 

Observers of the Kalman fllter type are designed to provide, for each 
time step, the optimal estimate (i.e. of minimal error variance) of the system 
state, by using only the previous estimates of the state and the last obser- 
vations [III [9] . These fllters alternate propagation steps (with the physical 
model) and correction steps (using the observations) of the state and of its 
error statistics. In the case of a non-linear physical model the extended 
Kalman fllter only yields an approximation of the optimal estimate. As the 
oceanographic models have become very complex in the recent years, the 
high computing cost of the extended Kalman fllter can be prohibitive for 
data assimilation [21j . The nudging techniques take advantage of the form of 
the Kalman fllter, alternating propagation and correction steps but the gain 
matrices (or coefficients) are often chosen to be constant, and their expression 
requires very few (or no) calculations [TU1123]. 

Most Kalman-type filters, or Luenberger observers [IB] do not take into 
account the symmetries of the model. But the symmetries often contain 
useful geometrical information on the model that can help for the design 
and improve the performances. Indeed, there has been recent work on ob- 
server design and symmetries for engineering problems when the model is 
of finite dimension and when there is a Lie group acting on the state space 
[21 [H [13, m [6] . Symmetries provide a helpful guide to design correction terms 
based on the very structure of the physical system. The only difference be- 
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tween the observer and model equations comes from the correction term. 
When this term is bound to preserve symmetries, the observer is called "in- 
variant", or "symmetry-preserving". The result is that the estimations do 
not depend on arbitrary choices of units or coordinates, and the estimates 
share common physical properties with the system variables (in the examples 
given in estimated concentrations are automatically positive, estimated 
rotation matrices automatically belong to SO (3)). In some cases, the error 
system even presents very nice properties (autonomous error equation in [6]). 

This paper is an extension to the infinite-dimensional case of the recent 
ideas on observer design and symmetries for systems described by ordinary 
differential equations (ODE) [7], in the particular context of nudging for 
data assimilation in oceanography. The problem considered is the following: 
the ocean is described by a simplified shallow-water model. The sea surface 
height (SSH) is measured (with noise) everywhere by satellites. The goal is 
to estimate the height, and the (not measured) marine currents. The model 
considered is invariant by rotation and translation (S'i? (2)-invariance). In 
the case of problems described by infinite dimensional partial differential 
equations (PDE), the design of observers based on the symmetries of the 
physical system is new to the authors' knowledge. 

The main contribution of this paper is to derive a large class of observers 
based on the 5'i?(2)-invariance for this estimation problem. At first we are 
only interested in the structure of the observers, and we do not check the 
validity of the estimation. Then we pick a particular observer in this class. 
It has several remarkable properties as the design is based on physical con- 
siderations. The correction term does not depend on any non-trivial choice 
of coordinates. Moreover, it is made of a smoothing term which ensures re- 
markable robustness to white noise. The correction term consists in fact in 
a convolution of the estimation error with a smooth isotropic kernel. The 
idea to derive systematical smoothing terms based on physical symmetries is 
standard in image processing, and was initiated by [3j. One of the simplest 
method consists in a convolution with a smooth rotation-invariant kernel 
(isotropic diffusion based on the heat equation). But in this paper we com- 
bine the smoothing term with a dynamical model to provide an estimation of 
physical quantities which are not directly measured, i.e. we build an observer. 
The second main contribution of the paper is to yield a proof of convergence 
of the estimation error to zero on a simplified shallow water model. The proof 
shows how to handle the convolution term via a Fourier analysis. In fact we 
solve the following estimation problem: predicting the velocity in a water- 
tank using (very) noisy continuous measurements of the height everywhere 
at any time. Thus this paper can be viewed as a counterpart in the observer 
field of the work on the controllability of a water tank, which was raised by 
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[H] . Finally, a noticeable fact is that the observer depends on a small number 
of parameters, as the structure of the observer is imposed by the structure 
of the system. All these parameters admit a physical interpretation. So the 
observer gains are easy to tune, and the corresponding computing cost is 
quite low, contrarily to the extended Kalman filter. This property is in fact 
characteristic of symmetry-preserving observers [7] . The performances of the 
observer and the low computational cost have been numerically tested. 

The paper is organized as follows. In Section 2 we consider a bi-dimensional 
shallow water model, often used in geophysics for ocean or fluid flow model- 
ing. We define a class of invariant observers in the case of SSH observations. 
Then, we prove the convergence of the estimation error to zero on the first- 
order approximation of this system. Indeed we assume that the fluid motion 
is described by linearized wave equations under shallow-water approxima- 
tions. In Section 3, we report the results of extensive numerical simulations 
on both the linearized (and simplified) and nonlinear shallow water models. 
We show the interest of invariant observers in these cases. These results are 
also compared with the standard nudging technique, which can be seen as 
a particular situation of our symmetry-preserving observers. We also show 
their remarkable robustness to gaussian white noise on the observations. Fi- 
nally, some conclusions and perspectives are given in Section 4. 

2 Nudging and symmetries 

In this section we consider a simplified oceanic model. The state of the ocean 
is the SSH, and the horizontal speed of the marine currents. The choice of the 
orientation and the origin of the frame of used to express the horizontal 
coordinates (x, y) G is arbitrary: the physical problem is invariant by 
rotation and translation. Indeed from a mathematical point of view the 
Laplace operator A is invariant by rotation and translation. The first term 
of any observer for this problem is automatically invariant by rotation and 
translation, as it is a copy of the equations of the physical system. There is no 
reason why the correction term should depend on any non-trivial choice of the 
orientation and origin of the frame. It would yield correction terms giving 
more importance to the values of the height measured in some arbitrary 
direction of M^. In the general case, without additional information on the 
model, it seems perfectly logical indeed to correct the observer isotropically. 
This constraint suggests interesting correction terms. 
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2.1 Shallow water model 



The shallow water model (or Saint- Venant's equations) is a basic model, 
representing quite well the temporal evolution of geophysical flows. This 
model is usually considered for simple numerical experiments in oceanog- 
raphy, meteorology or hydrology [TH]. These equations are derived from a 
vertical integration of the three-dimensional fields, assuming the hydrostatic 
approximation, i.e. neglecting the vertical acceleration. We consider here 
the shallow water model of Jiang et al [Hj. For deeper water, this model can 
be adapted into a multi-layer model, each layer being described by a shallow 
water model, with some additional terms modeling stress and friction due to 
the other layers. 

The fluid is made of one layer of constant density p and with varying 
thickness (or height) h{x,y,t), covering a deeper layer of density p + Ap. 
The domain is rectangular: < x < L and < y < L, where x and y are 
the cartesian coordinates corresponding to East and North respectively. Let 
V be the corresponding gradient operator: 



+ (V ■ (hv) + (hv) ■ V)v = -g'hVh - k x f{hv) + (oa^V^ - R){hv) + ata«ri/p, 

(1) 



where hv = h{vxi + fyj) is the horizontal transport, with i and j pointing 
towards East and North respectively, / = /o + Piy — 0.5D) is the Coriolis 
parameter (in the /3-plane approximation), k is the upward unit vector, and 
g' is the reduced gravity. The ocean is driven by a zonal wind stress fi 
modeled as a body force, and f is known. Finally, R and A represent friction 
and lateral viscosity. 

We assume that the physical system is observed by several satellites that 
provide measurements of the SSH h{x, y, t) for all x, y, t. Within the frame- 
work of data assimilation for geophysical fluids, the goal is to estimate all 
the state variables v{x,y) and h{x,y) (velocity of the marine streams, and 
SSH respectively) at any point {x,y) G [0,L]^ of the domain. We finally 
consider that all the other parameters are known. Note that the state space 
is of infinite dimension. 




The equations write: 



dh 
'dt 



V ■ (hv) 



(2) 
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2.2 Model symmetries 

The unit vectors i and j are pointing East and North respectively. This choice 
is arbitrary, and the equations of fluid mechanics do not depend neither on 
the orientation nor on the origin of the frame in which the coordinates are 
expressed: they are invariant under the action of SE{2), the Special Euclidean 
group of isometrics of the plane M^. Let us prove this invariance. Let Rg = 

( ^^^f I be a horizontal rotation of angle 6. Let (xcVo) G be 

\^ sm6' cos6' y ^ ^ 

the origin of the new frame. Let (X, Y) be the coordinates associated to the 

new frame -R_e(i,j) — (xo,yo)- In this new frame, the variables read 

iX,Y) = Rgix,y) + {xo,yo), (3) 
HiX,Y) = hix,y), (4) 
ViX,Y) = Revix,y). (5) 

and i^,^) = Rei^,-^) which implies VH{X,Y) = ReVh{x,y). The 
equations in the new coordinates read the same with the notations K = k 
and I = Rgi: 

^iE}^ + (V ■ (HV) + (HV) ■ V)V = -g'HVH - K x f{HV) (6) 
ot 

+ {aAAV' - R){HV) + ataufl/p, 
^ = -^■(HV). (7) 

The Laplace and divergence operators are unchanged by the transforma- 
tion as they are invariant by rotation (although they are usually written in 
fixed coordinates, their value do not depend on the orientation of the chosen 
frame). Note that the square domain D = [0,L]^ C has to be replaced 
by the square {RqD + {xo,yo)) C M^. 

2.3 Symmetry-preserving nudging 

An observer for the system ([I])-(l2l) (nudging estimator) systematically writes 



+ {V ■ (hv) + (hv) ■ V)v = -g'hVh-kx f{hv) + {aAAV^ - R){hv) 

+atauri/ p + F^{h,v,h), (8) 

dh 

— = -^.^hv) + F,{h,v,h), (9) 
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with h = h on the boudary of the domain, and where the correction terms 
vanish when the estimated height h is equal to the observed height h: 

F,{h,v,h)=0, Fh{h,v,h) = 0. 

No additionnal condition is required on F^^ and Fh at this stage. They are 
functionals, and they can be for instance differential or integral operators of 
the space variables. The next step is to consider correction terms that respect 
the underlying physics of the system in the sense that they are invariant under 
the action of SE{2). Indeed why the quality of the estimation should depend 
upon the choice of orientation and origin of the frame when the physical 
system under consideration does not at all? Let us search the correction 
terms in the large class of 5'i?(2)-invariant differential and integral terms. 

Invariant correction terms 

We look for three types of invariant correction terms: scalar, differential, and 
integral. Let us first focus on the former two. Note that F^ is a vector and Fh 
a scalar. To find F^, we use the standard result (see e.g. [22j), which states 
that any 5'£'(2)-invariant scalar differential operator writes (5(A), where Q 
is a polynomial and A is the Laplacian. 

In our case, the coefficients of this polynomial can be invariant functions 
of the estimated state (/i, v) and measurement h. Note that v is the unique 
vectorial function amongst these variables. The scalar variables h and h 
are S'£^(2)-invariant when transformed accordingly to (jl]). Thus any scalar 
invariant function of (/i, h, v) must depend on v only via an invariant function 
of -O, typically \v\^. Indeed, using polar coordinates it is clear that it is 
the unique quantity that does not depend on the angle between v and the 
(arbitrary) axis of the frame (see e.g. [iHlIZj for the construction of a complete 
set of scalar invariants). But since differential terms are allowed, and the 
gradient V is a vectorial differential operator which rotates under the action 
of SE{2), one can get some rotation-invariant scalar terms from the scalar 
product between the gradient of invariant quantities, and the only vectorial 
function at our disposal v. Eventually, we get a complete family of scalar 
terms Fh : 

Fh = Qi{A, h, \v\\ h-h) + V (Q2{A, h, \v\\ h-h)yv + fh, (10) 

where Qi and Q2 are scalar polynomials in A, and fh is an integral term 
defined below. More precisely, for i = 1, 2, we have 

N 

Qi{A, h, \v\^, h-h) = J2 \v\^ h-h) A^ (bi{h, \v\^, h - h)'^ , (11) 

A;=0 
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where a], and 6^ are smooth scalar functions such that 



ai{h,\v\',0)^biih,\v\',0)^0. 



(12) 



For the vectorial correction term F^, we use the vectorial counterpart: 



where Pi and P2 are polynomials in A, like Qi and Q2. 

Let us now find the integral terms fh and that are invariant by rotation 
and translation. They can be expressed as a convolution between the pre- 
vious invariant differential terms and a two-dimensional kernel ij^{C,C)- The 
previous terms being invariant by rotation, the value of the kernel should 
not depend on any particular direction, and so ip must be a function of the 
invariant + (isotropic gain). So if we let (py and (ph be two real- valued 
kernels, the integral correction terms write: 



where the polynomials Ri and Si are defined like the Qj's. 

The support of (p^ (rcsp. (ph) is a subset of R. Its characteristic size 
defines a zone in which it is significant to correct the estimation with the 
measurements. The radial term -|- makes the observer independent of 
any arbitrary choice of orientation (invariance by rotation), and the use of a 
convolution makes the observer independent of the origin of the frame (in- 
variance by translation). The integral formulation is actually quite general: 
if (py and (ph are set equal to Dirac functions, one obtains the differential 
terms. 

Such integral correction terms make the observer robust to noise. Indeed 
if the kernels are smooth, the correction terms take into account the mea- 
surements, but are automatically smooth even if the measurements are not. 
The high frequencies in the signal are thus efficiently filtered. 




(13) 
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2.4 Convergence study on a linearized simplified sys- 
tem 

As it seems very difficult to study the convergence on the full non-linear 
system, we are going to simplify the system ([I])- ([2D in this section, and then 
linearize it around the equilibrium position h = h and v = v. In our ex- 
periments, the equilibrium is characterized by h equal to a constant height, 
and V = 0. The observer gains are designed on this latter system, and then 
we prove at the end of this section that they ensure the strong asymptotic 
convergence of the error. 

For reasons of clarity, we first consider a simplified shallow water model 
(Saint- Venant equations): 

§ = -v(H, (16) 

dv 

— = -v^v-g^h. (17) 

In order to avoid the amplification of the measurement noise by a differen- 
tiation process, only the integral correction terms are kept: one sets Qi = 
Q2 = Pi = P2 = 0, Ri = S2 = 0, and R2 = Si = h — h, and the ker- 
nels (ph and (py are supposed to be smooth functions. Thus the correction 
terms are smooth, and one can use the gradient of the output, as it is filtered. 
The symmetry-preserving observer ([H])-([n]) for the simplified system (|TU|) - (|T7|) 
writes: 

^ = -vChv) +JJ ue + C) {h - /^)(.-?,.-c,*) « 

= -VChv) + Vh*{h-h), (18) 
_ = -vVv -gVh+ / / 0,(e' + C') V{h - /i)(.-5,j,-c,t) « 

= -vVv - gVh + ip,*V{h- h), (19) 
where M^, = + C) and M^, Q = + C')- 

Remark: Note that in the degenerate case where (ph = KhSo and (p^ = K^Sq 
{Kfi and K^j are positive scalars), we find the standard nudging terms: 

dh 

= -Vihv) + Kj,{h-h), (20) 

at 

dv 

— = -v^v - gs/h + KyV{h-h). (21) 
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Using exactly the same simplifications as [20J which considers the control 
problem and motion planning of system ffTUl) - (fT7|) with boundary control, 
we study the first order approximation (or linearized approximation) of this 
system around the steady-state {h,v) = {h,0), where the equilibrium height 
h is constant. We only consider small velocities Sv = v — v <ti \/~^ and 
heights 5h = h — h ^h. Note that these approximations are consistent with 
the numerical experiments of Section [3l in which the ratio 5v (resp. 5h) to 
\/gh (resp. h) is of the order of 10~^ to 10~^. The linearized system is 

= -hV5v, (22) 



dt 
d{6v) 

dt 



-gV5h, (23) 



and the estimation errors, h = 6h — 6h and v = 6v — 6v, are solution of the 
following linear equations: 

dh - 

— = -hVv-iph*K (24) 
dv 

— = -gWh-ip,*Vh. (25) 

Eliminating v and using V{(pv * V/i) = ip^ * Ah yields a modified damped 
wave equation with external viscous damping: 

d^h - ~ - ~ dh 

— = ghAh + h!f„* Ah - ifh* -g^- (26) 

We now define the kernels ip^ and iph- We choose 

ip,{x,y) = {f{x)*f{x)){f{y)*f{y)), (27) 
Vh{x,y) = ig{x) * g{x)){g{y) * g{y)), (28) 

where / and g are smooth even functions. In order to respect the symmetries, 
both (p^{x,y) and iph{x,y) must also be functions of + y"^. The following 
gaussians respect both conditions: 

(Pvix,y) = /3^exp(-a^(x^ + y^)), (29) 
iPh{x,y) = Phexp{-ah{x^ + y^)), (30) 

Theorem 1. If ^p^ and iph are defined by ^29) and (E^) respectively with 
(3y, l3h,av,C(h > 0, then the first order approximation of the error system 
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around the equilibrium {h, v) = {h, 0) given by is strongly asymptotically 
convergent. Indeed if we consider the following Hilbert space and norm: 



n = H\n) X L^{n), \\{u,w)\\n=(^J\\^uf + \w\^y , (31) 



then, for every h solution of (1261) . 



lim 

t~*oo 



= 0. (32) 



This result proves the strong and asymptotic convergence of the error h 
towards 0, and then it also gives the same convergence for v. We deduce 
that the observer ( JT8l) -( fT9l) tends to the true state when time goes to infinity. 
Note that although only the height is observed, all variables are corrected, 
and estimated. 

A dimensional analysis can yield a meaningful choice of the gains. The 
parameters are expressed in m. They define the size of the regions 

of influence of the kernels, i.e. the region around any point in which the 
measured values of h are used to correct the estimation at the point. These 
value can be set experimentally using the data from the physical system. 
Moreover, concerning the tuning of (3^ and j3h, one can use the following 
heuristics. The error system (l26l) can be approximated by the following 
system, which corresponds to the case a = +oo: 

^ + 2^00.0^ = (^o^o)'A/.. (33) 

where l^uJl = gh + hPy, 2^oi^o = Ph, as long as we impose L^u^ > gh. [3^ and 
I3h can be chosen in order to control the characteristic pulsation cuq, length Lq, 
and damping coefficient of the approximated error equation (IH^ . These 
quantities have an obvious physical meaning and can be set accordingly to the 
characteristics of the physical system under consideration. Such heuristics 
provide a first reasonable tuning of the gains. 



Remark In the following proof, we only use the decomposition of the gain 
functions given by the general formulation (|271) and (l28l) . with some addi- 
tional assumptions on / and g (see end of section [231) . Thus, many other 
kernel functions than those given in fP^ - flHUl) lead to the same convergence 
result. 
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2.5 Proof of theorem [T] 

In this section, inspired by [H], we prove the strong convergence of the error 
system in the Hilbert space Ti. Let ifj^ = ghSo + hipv For simphcity reasons, 
we assume that L = tt. The error equation ( l26l) can be rewritten as a modified 
wave equation on a square domain with Dirichlet boundary condition: 

= ijj^* Au- Lfh* -§iu in M+ X = ]R+ X [0, nf, 
u = on M+ X dn, (34) 

u(0) = Mo, Ut{0) = ui in Q, 

where u{t,x,y) represents the estimation error h. The Dirichlet boundary 
condition comes from the fact that we set h = h on the boundary. 

We denote by (cpg) the following orthonormal basis of Hq{Q), composed 
of eigenf unctions of the unbounded operator A: 

2 

Cpg = - sm{px) sm{qy). (35) 

71 

As / and g are smooth functions {C°°{Q) for instance), we can consider 
their Fourier series expansion. Moreover, as / and g are even functions, 
their Fourier coefficients are real. If we denote by (/p) and [gp) the Fourier 
coefficients of / and g respectively, then the Fourier coefficients of are 
gh + hfpfg. Similarly, the Fourier coefficients of iph are (jpC/g. As all these 
coefficients are real and positive, we denote them by /^^ for and gp^ for 
!fh- The only assumption that we need on / and g in the following proof is 
that fp>0 and cjp > 0, Vp. Note that this condition is satisfied if / and g 
are defined in a such way that (fy and iph are given by (PU1) - (!HUI) . We now 
need the following intermediate result: 

Lemma 1. If uq E Hq{Q) and ui G L'^{VL), then equation flH^ has a unique 
solution satisfying 

ueC{R-Hl{n))r}C\R-L^{n)). (36) 

It is given by the series 

2 . 

u{t, x,y) = -Y] Upq{t) sm{px) sin(g?/), (37) 

TT ^ — ^ 

with either 

Upq{^) = (^~^\Apq cos{upgt) + Bpg Sm{Upgt)), (38) 

or 

Upg{t) = e^^\Apg cosh(cl'pgt) + Bpg sinh(cDpgt)). (39) 
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Moreover, the latter case appears at most for a finite number of indices, and 



^pq < —■ 

Proof. We rewrite equation flM|) as 

±U^AU, (40) 

where U = {u, Ut) and A is the following unbounded linear operator on Ti: 

A{u, w) := (w, ipv * Am — iph*w). (41) 
From fBTj) and flH^ . we deduce that 

= (42) 

are eigenvectors of A associated to the eigenvalues \±pq, solutions of 

Alp, + glq\±pq + flqip" + q') = 0. (43) 

Moreover, the family of eigenvectors {Epq) forms a Riesz basis of the Hilbert 
space Ti. The discriminant of (HHI) is Apg = (yf^^ — Aij)^ + q^)fpq. It can be 
positive for a finite number of indices only, since g^^ — * and /^^ > (^/i when 
p and g go to infinity. We found a Riesz basis of Ti formed by eigenvectors 
of A, the eigenvalues have no finite accumulation point and their real part 
are bounded. Thus all assumptions of theorem 3.1 of fT3] are satisfied: the 
solution U of (jlO]) is given by the series 



= E ^p'i'' ^ + ^-f^^ ^ 



pi} 



Apq<0 



+ E ^ * + f/-p,e ^ * Up,. (44) 



Finally, the coefficients can be found using the Fourier series of the initial 
condition. We have 
4 /■ 

\q = ^ u{0,x,y) sm{px) sm{qy) dxdy, (45) 



'[0,7r]: 

^pq = — I { ^' y) + %"(0' y) ) sin(px) sm{qy) dxdy{A6) 



□ 
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All we have to prove now is that the solution, which represents the es- 
timation error, converges to when time goes to infinity. Recall that the 
coefficients Upg are given by equation (l38|) . except for a finite number of 
indices. Define 



UN{t,x,y) 



TT ^ 

p+q>N 



\Apg cos{uJpgt) + Bpg sm{ujpgt)) slu (^x ) sin{qy). 



(47) 

Since Uq G Hq{Q) and Ui e L^(fi), Parseval's theorem tells us that for any 
e > 0, there exists such that 



... duN 

UN{t),-^{t) 



< e/2, Wt > 0. 



(48) 



H 



From ( l38l) and ( l39l) . there exists T > such that for any t >T, 



[u - UN){t) 



d{u — un) 
dt 



< e/2. 



(49) 



H 



Finally, Hm, -UfH-?^ < e for any t>T. We proved equation ( 132|) . i.e. the strong 
convergence of the linearized error system. 

Note that this proves the result for any kernel functions defined by (!27|) 
and (1^ . provided all Fourier coefficients g^^ are strictly positive. Note also 
that for iV > arbitrary large, from lemma [T], the truncated solution 
tends to exponentially in time. Thus exponential convergence is expected 
in numerical experiments. 



3 Numerical simulations 

In this section, we report the results of many numerical simulations on both 
the linearized and non-linear shallow water models, in order to illustrate 
the interest of such symmetry-preserving observers. The nonlinear observer 
is given by equations (fT^ - (|T^ . with gain functions iph and (p^ given by 
equations (!2U|) - (!Hn|) . Although the gain design has been done on the simplified 
linearized system, the gains are also implemented on the non-linear observer 
(on the full non-linear shallow water model). 

3.1 Linearized simplified system 

We first consider a shallow water model, in a quasi-linear situation (small 
velocities, and height close to the equilibrium height) given by equations 
(fT6l) - (fT711 . The corresponding observer is solution of equations (fT8l) - (fT9l) . 
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3.1.1 Model parameters 

The numerical experiments are performed on a square box, of dimension 2000 
km X 2000 km. The equilibrium height is h = 500 m, and the equilibrium 
longitudinal and transversal velocities are = Vy = m.s~^. We consider 
a regular spatial discretization with 81 x 81 gridpoints. The corresponding 
space step is 25 km. The time step is half an hour (1800 seconds), and we 
have considered time periods of 1 to 4 months (1440 to 5760 time steps). 

The reduced gravity is g = 0.02 ms'"^. The height varies between 497.7 
and 501.9 m and the norm of the transversal velocity is within the interval 
±0.008 m.s^^. The approximations of the preceding section are valid since 
V -C ^/gh = 3 m.s~^ and 6h <^ 500. The variations of the height and 
velocities are indeed of the order of 2 meters and 0.01 m.s~^ respectively. This 
kind of linearized system with the typical values above is often considered 
in geophysical applications, under the tangent linear approximation, for the 
estimation of an increment (instead of the solution itself) [5]. 

Concerning the tuning of the gains, we have considered the convolution 
kernels defined by equations (pU|) - (!Hn|) . Recall that and represent the 
characteristic size of the Gaussian kernel. We will always take = = 
a. In most of the experiments below we have a = 1 m~^. Unfortunately 
the weights jSh and (3^ cannot be chosen too large for numerical reasons, in 
order to avoid stability issues. So we always take jSh < 10~^. Recall that 
heuristically the error equation can be approximated by the damped wave 
equation ( !33l) with hp^j = L^u^ — gh and Ph = ^^o^o- The weights Ph and 
Py have two different units, and physical meaning, and (a priori) there is no 
physical reason why they should have approximately the same magnitude. 
Nevertheless, for the numerical values of Ph considered in this paper, one 
can check that any value < P^ < Ph yields a fundamental frequency for 
the error system loq^/I — which is close to the natural frequency \/gh/LQ 
of the physical system (fT6l) -( fT71) . From now on we will systematically set 
Pv = 0.1 Ph, which is acceptable from a physical point of view, also ensures 
the convergence of the observer, and is the largest value of P^ which yields 
numerical stability. Finally, a truncated convolution integral is used as an 
approximation of the complete convolution over the whole domain. The 
truncation radius is set equal to 10 pixels in our experiments (further than 
10 pixels away from its center the Gaussian can be viewed as numerical noise). 

We consider two criteria for quantifying the quality of the estimation 
process: the convergence rate of the estimation error, and the estimation 
error when convergence is reached. The initialization of the observer is always 

h = h{=hm), v = v{=Qi). 
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In all the following results, the estimation error is the relative difference 
between the true solution [h and v) and the observer solution {h and -O): 



— TT, jT, , e„ — n ^11 l^Uj 

— — f|| 

where || . {| is the standard norm on the considered domain. With the 
previously defined initialization of the observer, the estimation error at initial 
time is e/i(0) = 6^,(0) = 1, corresponding to a 100% error on the initial 
conditions. If we assume that the decrease rate is nearly constant in time, 
then the time evolution of the estimation error is given by: 

eh{t) = e/i(0) exp(-Cftt), e^{t) = 6^(0) exp{-c^t) , (51) 

where Ch and are the corresponding convergence rates. In all the numerical 
experiments that we have considered, the choice of the weighting coefficients 
and j3y does not modify the residual estimation errors at convergence. 
One observes that the convergence rates are linearly proportional to f3h (and 
to Py = O.ljSh), provided it is not too large. This is explained by formula 
(1371) as the Fourier coefficients depend linearly on Ph- 

3.1.2 Perfect observations 

We first assume that the observations are perfect, i.e. without any noise. 
Figure [1] shows the estimation error (in relative norm) versus time (number 
of time steps), for the three variables: height h, longitudinal velocity and 
transversal velocity Vy. The kernel coefficients are the following: 

(3h = 5.10*^ s-\ (3y = O.lph = 5.10"^ m.s-\ at = = 1 m~'^. 

This figure shows that the convergence speed is nearly constant in time, 
and equation ( l5Ti) is then valid. We can also deduce the corresponding con- 
vergence rates: 

Cf, = 7.57 X 10"^ c^, = 7.63 x 10"^ c^^ = 7.80 x 10'^. 

Another interesting point is that, although only the variable h is observed, 
the velocity v is also corrected, with a comparable convergence rate. This is 
predicted by the theory above, but it is nevertheless extremely noticeable, as 
in most data assimilation processes, only a few variables of the system are 
observed [10, EHl H]. This result shows (at least in the linear case) that all 
the variables are observable indeed. 
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Figure 1: Evolution of the estimation error in relative norm versus the num- 
ber of time steps, in the case of perfect observations, with a/j = = 1 
and Ph = 5.10^^ s^^, and with a 100% error on the initial conditions, for the 
three variables: height h, longitudinal velocity Vx and transversal velocity Vy. 
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Figure 2: Evolution of the estimation error in relative norm versus the num- 
ber of time steps, in the case of perfect observations, with ah — CKy — 1 m"'^ 
and Ph — 2.10"^ s~^, for the three variables: height h, longitudinal velocity 
Vx and transversal velocity Vy. 
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Figure [2] shows the resuhs of a similar experiment using different kernel 
coefficients {Ph = 2.10"^ and still = O.iPh, ah = = I m~^). 
The decrease rate is constant: Ch = 2.84 x 10~^, c^^ = 2.61 x 10~® and 
c^y = 2.91 X 10~^. The ratio between the decrease rate and f3h is almost 
preserved (the decrease rate has been multiplied by 3.5 to 3.75, and Ph by 4), 
as explained by formula fl37l) . From now on, we will only give the decrease 
rate corresponding to the following weight values: 

Ph = 5.10'^ s'\ p^ = O.lph = 5.10~® m.s-^. 

In both experiments, the estimation error at convergence has comparable 
(small) values: 

eh = 7.92 X 10-^ e^^ = 2.11 x 10-\ e^^ = 4.71 x 10~^ 

From a theoretical point of view, it should converge to 0. Several reasons 
explain this difference with the theory. The numerical non-linear system 
considered is not exactly described by its first-order approximation. Moreover 
the numerical schemes and numerical noise do not allow the observer solution 
to reach exactly the observed trajectory. Note that the small oscillations 
in the decrease of the estimation error can be explained by the oscillatory 
behavior described by ( 1371) . Numerically speaking, the fact that the model 
has nearly no diffusion (no theoretical diffusion, and almost no numerical 
diffusion) can also contribute to this oscillatory phenomena. 

Finally, we compare our observer to the standard nudging algorithm, by 
choosing a large value for ah and a^- Numerically we have set 

ah = = 1000 m~^. 

The decrease rate and estimation error at convergence are summarized in 
table [1] along with the previous results. The decrease rate of our observer is 
2.7 to 3 times bigger. But assuming the solution {h,v) is constant (which 
is nearly true), the convolution with a gaussian kernel of size 1 or with a 
dirac produces the same effect, with a vr factor (as J^2 e~*-^ Mxdy = vr). 
Numerically, the factor is a little bit smaller, as the solution is not constant. 
We also see that the estimation error at convergence is a little bit larger for a 
large. Numerically, the small difference can probably be explained by some 
numerical noise, which is smoothed by the convolution. 

3.1.3 Noisy observations 

We now assume that the height h cannot be observed properly, and instead of 
/i, we observe h + e where e represents the observation noise on h. We assume 
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Size of the 


Decrease rate 


Estimation error at convergence 


gaussian kernel 


(h. V-y. f,,) 


ih. v^. V,,) 




7.58 X 10-^ 


7.92 X 10^*^ 


ah = ay = 1 


7.63 X 10-^ 


2.11 X 10-^ 




7.80 X 10-^ 


4.71 X 10~^ 




2.49 X 10-^ 


1.02 X 10-^ 


ah = ay = 10^ 


2.61 X 10-^ 


2.65 X 10-^ 




2.87 X 10"^ 


6.12 X 10"^ 



Table 1: Decrease rate and value at convergence of the estimation error, for 
the three variables /i, Vx and Vy, for two different sizes of the gaussian kernel. 
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Figure 3: Evolution of the estimation error in relative norm versus the 
number of time steps, in the case of noisy observations (20% noise), with 
ah = ay = 1 and [3h = 2.10"^ for the three variables: height h, 
longitudinal velocity and transversal velocity Vy. 



that e is gaussian with zero mean (white noise is standard in oceanography 
[9]), and a standard deviation of 20 to 40% of the standard deviation of the 
height h. Thus a 0.2 estimation error means that the estimated value h is 
closer to the true height h than to the observed height h + e. Figure shows 
similar experiments as previously described, in the case of noisy observations, 
for (3h = 2.10"^ s^^ and a = 1 m~^. The global behaviour of the solution 
is unchanged (constant decrease until stabilization). The decrease rate and 
value at convergence of the estimation error for a = 0.5 m~^, 1 and 10^ are 
summarized in table [21 

There is still a ratio of nearly vr between the decrease rate for a large and 
a = 1 m~^. a = 0.5 seems to be an optimal value for the parameter a: 
it is large enough to smooth efficiently the noise, and we checked that the 
decrease rate is not much larger when we take smaller values of a. Thus we 
see it is useless to correct the estimation at one point with values of h which 
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Size of the 


Decrease rate 


Estimation error at convergence 


gaussian kernel 


(h. V-y. f,,) 


ih. v^. V,,) 




1.49 X 10^^ 


4.43 X 10"^ 


Q/f^l = Cty = 0.5 


1.40 X 10"*^ 


7.51 X 10"^ 




1.42 X 10"^ 


4.06 X 10"^ 




7.55 X 10-^ 


5.92 X 10-3 


ah = ay = 1 


7.44 X 10-^ 


1.04 X 10-2 




7.44 X 10"^ 


5.53 X 10-3 




2.45 X 10-^ 


1.70 X 10-2 


ah = ay = 10^ 


2.49 X 10-^ 


3.02 X 10-2 




2.48 X 10-^ 


1.59 X 10-2 



Table 2: Decrease rate and value at convergence of the estimation error, 
for the three variables /i, Vx and Vy, for three different sizes of the gaussian 
kernel, in the case of noisy observations (20% noise). 



are too far away from this point. In comparison with the case of perfect 
observations, the decrease rate is remarkably unaffected by the presence of 
noise. 

The estimation error at convergence is much larger than in the case of 
perfect observations. Nevertheless, all variables have been identified with 
less than 1% of error. We see the interest of the convolution as the error at 
convergence is 3 to 4 times smaller with a ~ 1 than with a = 1000. This is 
due to the fact that the term V{h — h) is very noisy when it is not directly 
filtered, as it is the case in the standard nudging algorithm (or extended 
Kalman filter). 

3.2 Full nonlinear shallow water model 

We now consider the full shallow water model, with the Coriolis force, fric- 
tion, lateral viscosity, and wind stress (see equations ([I])-(l2])). We also con- 
sider large velocities and height variations, with still the same equilibrium 
point: h = 500, = Vy = 0. The size of the domain and the time and space 
steps remain the same as in the previous experiments (see section [3. l.ip . the 
other physical parameters being: 

/o = 7.10-^s-\ P = 2.l0-^^m-\s'\ = 9.10-^ A = 5m'^.s-\ w = 0.05 5-2. 

As we will see at the end of this section, this model reproduces quite well 
the evolution of a fluid in the northern hemisphere (e.g. Gulf Stream, in the 
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Figure 4: Evolution of the estimation error in relative norm versus the num- 
ber of time steps, in the case of perfect observations, for Ph = 5.10"'' and 
«/i = = 0.5 m~^, for the three variables: height h, longitudinal velocity 
Vx and transversal velocity Vy. 



case of the North Atlantic ocean, double-gyre circulation, . . . ), with realistic 
velocities and dimensions p!9] . 

3.2.1 Perfect observations 

We consider the same convolution kernels as in the experiments on the ap- 
proximated system above, with the same reference parameters f3h = 5.10"^ 
and f3y = O.iPh- Figure H] shows the estimation error (in relative norm) 
versus time (number of time steps), for the three variables: h,Vx,Vy, with 
ah = Ciy = 0.5 m~^. Similar curves have been obtained with other values of 
a. The convergence speed for h,v are constant only at the beginning, and 
decrease continuously to after the error goes under some treshold. 

Table [3] summarizes the decrease rates at the beginning and the resid- 
ual estimation error. The final estimation error is much larger than in the 
previous experiments. Consequently, if the velocity is not well retrieved, the 
height cannot be perfectly identified. Nevertheless the height estimation er- 
ror is close to 1%, which is a very good result, considering the high turbulence 
of the model. The velocity is partially identified (with 12 to 15% of error 
in the best situations). The convergence rates are a little bit larger than in 
the linearized case. The behaviour between the standard gaussian convolu- 
tion (a = 1 m~^) and the Dirac convolution {a = 10^ m~^) is comparable 
to the previous experiments. Note that the nonlinear model has a diffusion 
term, and hence regularizes much more the solution than the linearized model 
without diffusion. It explains why there are no oscillation as in the previous 
cases. 
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Size of the 


Decrease rate 


Estimation error at convergence 


gaussian kernel 


(h. V-r.. f,,) 


(h. V-r-. v,,\ 




2.39 X 10"^ 


3.73 X 10^2 


Qij^ = Qi^ = 0.5 


1.79 X lO"*^ 


2.68 X 10"^ 




1.48 X 10"^ 


3.47 X 10"^ 




1.29 X 10-^ 


8.99 X 10-3 


ah = ctv = i 


1.07 X 10-^ 


1.24 X 10-1 




8.58 X 10"^ 


1.39 X 10-1 




4.45 X 10-^ 


1.46 X 10-2 


ah = ay = xlO^ 


3.19 X 10-^ 


1.70 X 10-1 




2.81 X 10-^ 


2.16 X 10-1 



Table 3: Decrease rate and value at convergence of the estimation error, 
for the three variables /i, Vx and Vy, for three different sizes of the gaussian 
kernel. 



3.2.2 Noisy observations 

As in the linearized situation, h + e \s measured, where e is assumed to be 
white. In our experiments, the standard deviation of e is nearly 20% of the 
standard deviation of h (around the equilibrium state h = 500). 

The estimation error in the case of noisy observations is nearly 1.5 times 
larger than for perfect observations, both for a = 1 mr"^ and a = 10^ m~'^. 
This confirms the relative insensitivity of the observer with respect to the 
presence of observation noise, as the level of noise is 20%, and the estimation 
errors are nearly 2% for h and 13 to 30% for the velocity. In this case, the 
best results have been obtained for a = 1 m~'^, improving the results of the 
nudging algorithm {a = 10^ ■m~'^) of 33 to 50%. These results clearly show 
the interest of a gaussian kernel applied to the correction term, in order to 
smooth the noisy observations (or the numerical noise). 

Finally, figures [6] and [7] illustrate the identification process for both the 
height and velocity in the case of noisy observations, for ah = a^ = 1 (second 
case of table H]). We do not use any a priori information, as the initial guess is 
h = h = 500 meters (top left image of figure E]), and v = v = m.s~^ . Figure 
[H] shows on the top right the noisy observation h + e oi the height at the final 
time T = 1400 time steps. It should be compared to the bottom right image, 
showing the true height h at the same time. The difference between these 
two images corresponds to the white Gaussian noise e. Finally, the identified 
height (i.e. the observer h at final time T) is shown on the bottom left image 
of figure El These images confirm both the very good identification of the 
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Figure 5: Evolution of the estimation error in relative norm versus the 
number of time steps, in the case of noisy observations (20% noise), for 
Ph = 5.10"^ and ah = = 10^ m~^, for the three variables: height h, 
longitudinal velocity and transversal velocity Vy. 



Size of the 


Decrease rate 


Estimation error at convergence 


gaussiaii kernel 


(/'■ l\r- 


(/'■ '',,/) 




2.74 X IQ-s 


1.71 X 10-2 


(Xh — Oiv — 0.5 


1.87 X 10-6 


1.72 X 10-1 




1.62 X 10-6 


2.21 X 10-1 




1.36 X 10-6 


1.57 X 10-2 


= = 1 


9.65 X 10-^ 


1.30 X 10-1 




8.38 X 10"^ 


1.59 X 10-1 




4.42 X 10-^ 


2.26 X 10-2 


a/i = ay = 10^ 


2.98 X 10-^ 


2.25 X 10-1 




2.55 X 10-^ 


3.04 X 10-1 



Table 4: Decrease rate and value at convergence of the estimation error, 
for the three variables h, Vx and Vy, for three different sizes of the gaussian 
kernel, in the case of noisy observations (20% noise). 
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Figure 6: Identification process for tlie lieight, in meters: initial guess {h{0) = 
h); noisy observation at final time {h{T) + e, with T = 1440 time steps); 
identified height at final time {h{T)); true height at final time {h{T)). 
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Figure 7: Identification process for tlie velocity, in m.s~^: identified longitu- 
dinal (resp. transversal) velocity at final time {v{T)); true longitudinal (resp. 
transversal) velocity at final time (f (T)). 
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height (as previously seen in table H]) and the noise removal. 

Figure [7] shows the identified and real components of the velocity. Note 
that even if there are no observations of the velocity, the observer v is very 
close to the real velocity v at time T. This is usually not the case in standard 
nudging techniques, where only observed variables are corrected [IDl ES, H] . 
The main current (correponding to the Gulf Stream, in the case of the North 
Atlantic ocean) is very well identified. This is extremely interesting, as in 
real geophysical applications, there are also almost no observations of the 
fiuid velocity, although it has to be identified precisely |;5j . From table HI we 
have previously seen that the error on the velocity is nearly 15% in this case, 
which is quite high. But the main currents are very well identified, and this 
is a key-point for improving the quality of the forecasts. In most geophysical 
data assimilation problems, the non-observed variables are only corrected 
thanks to model coupling, and it does not lead to such a nice identification 

4 Conclusion 

In this paper, we have defined a class of symmetry-preserving observers for a 
simplified and linearized shallow water model. We proved the convergence to 
zero of the error (i.e. difference between the observer and real trajectories) 
when time goes to infinity. Many numerical simulations show the interest of 
such a choice of invariant gains. This paper gives insight in the field of non- 
linear observers for infinite dimensional systems, where very few methods are 
available. 

The observer provides better results than the standard nudging, even 
on the nonlinear system, as the error converges faster, the residual error is 
smaller, and the observer is much more robust to noise. The correction terms 
used in this paper are different from those of the usual extended Kalman 
filter-type estimators (no integral over space is performed). Our observer 
has several advantages upon the extended Kalman filter. First the compu- 
tation cost is much smaller (as long as the gaussian kernel is set equal to 
zero wherever its value is negligible, see Section [H]). This is important as in 
oceanographic data assimilation, the computation cost of the Kalman filter 
can be prohibitive, as well as the cost of optimal nudging techniques |24] . 
Moreover the tuning of the gains of our observer is very easy as they depend 
on a very reduced number of parameters which have a physical meaning. It 
is precisely the use of the physical structure of the system which allows us 
to reduce the degrees of freedom in the gain design. Finally, to the author's 
knowledge, there is no proof of convergence of the Kalman filter for infinite 
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dimensional non-linear systems. Note that we also showed, both on theoret- 
ical and numerical points of view, that thanks to an appropriate choice of 
observer, it is possible to correct very well the non-observed variables with the 
observed ones, which is still a challenge in oceanography (and more generally 
in geophysics) [5]. 

Other types of correction terms could be used. In particular one could de- 
fine the correction term in order to get the following linearized error equation 
instead of (!26l) : 



Indeed an additional structural damping would drastically change the spec- 
trum. The use of such terms is generally prohibited by the presence of 
measurement noise, but the convolution product with a smooth kernel would 
allow us to use them. Another direction for future work would be to make 
numerical experiments on back and forth nudging based on our observer. The 
observer can easily be adapted in reverse time indeed, with ^ph ^ —'^h and 
(py unchanged (see e.g. ^ for details about this data assimilation method). 
Finally, in this paper we only considered time and space continuous mea- 
surements. Some experiments will also be carried out in the case of sparse 
observations, both in time and space. As a more general concluding remark, 
although this paper is only concerned with examples, it shows a system- 
atical way to take advantage of the rotational invariance of the Laplacian, 
and yields a method for the convergence analysis. This technique could be 
adapted to other estimation problems from physics and engineering, where 
the models are based on the wave equation or on the heat equation. 
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